INTRODUCTION


"Forecasting  and  Time  Series  Model  Types  of  111  Economic  Time  Series" 
is a chapter to be published in the book Major Time Series Methods and Their
Relative Accuracy by S. Makridakis, A. Andersen, R. Carbone, R. Fildes,
M. Hibon, R. Lewandowski, J. Newton, E. Parzen, and R. Winkler, Wiley: London,
1983. It reports in detail the forecasting procedure followed by Parzen and
Newton in their participation in the forecasting "competition" whose results
are reported in Makridakis, S., et al. (1982) "The Accuracy of Extrapolation
(Time Series) Methods: Results of a Forecasting Competition," Journal of
Forecasting, 1, 111-153.

The  joint  paper  did  not  explicitly  draw  any  conclusions  concerning 
which  methods  performed  best.  Commentaries  on  the  joint  paper  (to  appear  in 
1983  in  the  Journal  of  Forecasting)  seem  to  acknowledge  the  excellence  of 
the  forecast  errors  obtained  by  Parzen  and  Newton.  David  J.  Pack  points  out 
the  desirability  of  increasing  the  numeracy  of  the  joint  paper's  Table  2(b), 
which  provides  MAPE  measures  of  how  well  each  forecasting  method  performed 
for the entire 111 series sample (reproduced in Pack's Exhibit 1). Pack's
Exhibit 2 is the same table with methods ordered by the "average of
forecasting horizons 1-12" column, and all MAPEs divided by 13.4, the minimum
MAPE in the ordering column.

We  reproduce  Pack's  Exhibits  1  and  2.  Readers  must  draw  their  own 
conclusions  concerning  the  superiority  of  the  forecasting  methods  used  by 
Parzen  and  Newton.  Our  contribution  to  the  commentaries  on  the  joint  paper 
is  printed  at  the  end  of  this  report  with  the  title  "How  to  Learn  from  the  JoF 
Competition." 
1. Introduction

"Is it possible to put an end to the argument of what
forecasting methods are better and under what circumstances?" is
the question raised by Professor Spyros Makridakis in several
stimulating papers (1976), (1978), (1979). He has organized a
"forecasting  competition"  to  which  various  forecasting  experts 
would  contribute  forecasts  of  111  economic  and  business  time  series 
which  he  has  collected.  This  paper  reports  the  results  of  our 
analysis  of  these  series,  based  on  the  general  approach  to  time 
series  modeling,  spectral  analysis,  and  forecasting  developed  by 
Parzen,  with  the  collaboration  of  Newton. 

An  appendix  describes  the  theory  of  univariate  time  series 
modeling  and  forecasting  used  in  this  study.  The  main  text 
summarizes  the  diverse  models  which  are  encompassed  by  our 
approach,  and  which  arise  in  the  study  of  the  111  time  series  being 
forecasted. 

The  methods  of  time  series  modeling  and  forecasting  applied  in 
this  paper  can  be  applied  automatically  but  they  are  not  rote 
formulas,  since  they  are  based  on  a  flexible  philosophy  which 
provides  several  models  for  consideration  and  diverse  diagnostics 
for  qualitatively  and  quantitatively  checking  the  fit  of  a  model 
(see  Parzen  (1979),  (1980),  (1981)).  The  models  considered  are 
called  ARARMA  models  because  the  model  computed  adaptively  for  a 
time  series  is  based  on  sophisticated  time  series  analysis  of  ARMA 
schemes  (a  short  memory  model)  fitted  to  residuals  of  simple 
extrapolation  (a  long  memory  model  obtained  by  parsimonious  "best 
lag"  non-stationary  autoregression). 

A  consumer  of  time  series  forecasting  and/or  modeling  methods 
must  evaluate  the  value  of  a  proposed  procedure  in  the  context  of 
the  actual  time  series  with  which  he,  or  she,  is  concerned.  Our 
approach  aims  to  be  applicable  in  all  the  diverse  fields  to  which 
time  series  analysis  is  being  applied. 

A major problem of time series forecasting is whether long
range forecasting and short range forecasting require different
methods to obtain satisfactory forecasts. This paper describes
iterated models which provide qualitative diagnostics as to the
possibility of long range forecasts (by diagnosing whether the time
series is long memory). Both long range and short range forecasts
are provided by a model obtained by fitting a parsimonious
non-stationary autoregression whose residuals $\tilde{Y}(t)$ are modeled by a
stationary autoregression.

The modeling procedure is both automatic and flexible. In
particular, two model orders are determined for $\tilde{Y}(t)$ and we would
recommend computing and comparing forecasts from both models.

This  paper  aims  to  illustrate  the  results  one  obtains  by 
typical  graphs,  and  to  describe  the  time  series  model  types  that 
one  should  expect  to  encounter  when  dealing  with  many  economic  time 
series. 

2.  Iterated  Models  Approach  to  Time  Series  Analysis 

The problem of forecasting future values of a time series from
observations of its past values has an extensive literature which
proposes many different approaches. The approach adopted here aims
to fit automatically to a time series sample not one but several
models. The class of models considered is suitable for time series
modeling, spectral analysis, and forecasting, and for time series
encountered by researchers in the physical sciences, engineering
sciences, biological sciences, and medicine, as well as in the
social sciences, economics, and management sciences.

A  time  series  may  be  predictable  for  a  long  time  in  the  future 
or  only  over  a  limited  future.  We  say  the  former  has  "long  memory" 
and  the  latter  "short  memory".  A  time  series  with  long  memory 
requires  a  "non-stationary"  model  with  periodic,  cycle,  and  trend 
components.  A  time  series  with  short  memory  requires  a 
"stationary"  model  which  is  a  linear  filter  relating  the  time 
series  to  its  innovations  or  random  shocks.  The  linear  filter  is 
an  AR,  MA,  or  ARMA  filter  (autoregressive,  moving  average,  or  mixed 
autoregressive-moving  average). 

The model we fit to a time series Y(.) is an iterated model

$$
Y(t) \;\longrightarrow\; \tilde{Y}(t) \;\longrightarrow\; \varepsilon(t)
$$

If needed to transform a long memory series Y to a short memory
series $\tilde{Y}$, $\tilde{Y}(t)$ is chosen to satisfy one of the three forms

$$
\tilde{Y}(t) = Y(t) - \hat\phi(\hat\tau)\, Y(t-\hat\tau), \tag{1}
$$

$$
\tilde{Y}(t) = Y(t) - \hat\phi_1\, Y(t-1) - \hat\phi_2\, Y(t-2), \tag{2}
$$

$$
\tilde{Y}(t) = Y(t) - \hat\phi_1\, Y(t-\hat\tau-1) - \hat\phi_2\, Y(t-\hat\tau). \tag{3}
$$

Usually $\tilde{Y}(t)$ is short memory; then it is transformed to a white
noise, or no memory, time series $\varepsilon(t)$ by an approximating
autoregressive scheme AR(m) whose order m is chosen by an order
determining criterion (we use CAT, introduced by Parzen
(1974), (1977)).

In the present study, $\tilde{Y}(t)$ was found to be always short
memory. It is then modeled by a stationary autoregressive scheme.
It is argued by Parzen that approximating AR schemes suffice for
spectral  analysis  and  forecasting.  Only  for  model  interpretation 
is  it  desirable  to  fit  an  ARMA  scheme.  In  the  present  study  not 
more  than  15  percent  of  the  time  series  could  be  regarded  as 
requiring  an  ARMA  scheme. 


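To determine the best lag $\hat\tau$, we use non-stationary
autoregression; either fix a maximum lag M and choose $\hat\tau$ as the
lag minimizing over all $\tau$

$$
\sum_{t=M+1}^{T} \{Y(t) - \hat\phi(\tau)\, Y(t-\tau)\}^2
$$

or choose $\hat\tau$ as the lag minimizing over all $\tau$

$$
\sum_{t=\tau+1}^{T} \{Y(t) - \hat\phi(\tau)\, Y(t-\tau)\}^2 \div \sum_{t=\tau+1}^{T} Y^2(t).
$$

For each $\tau$, one determines $\hat\phi(\tau)$, and then one determines $\hat\tau$ (the
optimal value of $\tau$) as the value minimizing, respectively,

$$
\mathrm{Err}(\tau) = \sum_{t=M+1}^{T} \{Y(t) - \hat\phi(\tau)\, Y(t-\tau)\}^2 \div \sum_{t=M+1}^{T} Y^2(t)
$$

or

$$
\mathrm{Err}(\tau) = \sum_{t=\tau+1}^{T} \{Y(t) - \hat\phi(\tau)\, Y(t-\tau)\}^2 \div \sum_{t=\tau+1}^{T} Y^2(t).
$$
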

The decision as to whether the time series is long memory or not is
based on the value of Err($\hat\tau$). An ad hoc rule we use is: if Err($\hat\tau$) <
8/T, the time series is considered long memory. In the present
study all time series were judged to be long memory by this
criterion. When this criterion fails one often seeks
transformations of the form of (2) or (3), using semi-automatic
rules described in the appendix.
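
The lag search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the function names `lag_fit` and `best_lag` and the synthetic series are made up here, and the sums are taken over t = tau+1, ..., T (the second of the two variants in the text).

```python
def lag_fit(y, tau):
    """Least-squares coefficient and normalized error for a single lag tau.

    Sums run over t = tau+1, ..., T in the text's 1-based notation,
    which is indices tau, ..., T-1 of the Python list.
    """
    idx = range(tau, len(y))
    phi = sum(y[t] * y[t - tau] for t in idx) / sum(y[t - tau] ** 2 for t in idx)
    rss = sum((y[t] - phi * y[t - tau]) ** 2 for t in idx)
    return phi, rss / sum(y[t] ** 2 for t in idx)


def best_lag(y, max_lag):
    """Return (tau_hat, phi_hat, err) with the smallest Err(tau), tau = 1..max_lag."""
    fits = [(tau,) + lag_fit(y, tau) for tau in range(1, max_lag + 1)]
    return min(fits, key=lambda fit: fit[2])


# Synthetic quarterly-style series (made up): a period-4 pattern growing 5% per cycle.
pattern = [10.0, 14.0, 9.0, 12.0]
y = [pattern[t % 4] * 1.05 ** (t // 4) for t in range(40)]

tau_hat, phi_hat, err = best_lag(y, max_lag=5)
is_long_memory = err < 8 / len(y)   # the ad hoc rule Err < 8/T
print(tau_hat, round(phi_hat, 3), is_long_memory)
```

For this artificial series the search should select the annual lag 4 with a coefficient slightly above 1, and the 8/T rule should flag the series as long memory.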

For the maximum lag M of non-stationary autoregression, the
following rules were adopted in this study: M = 2 for yearly
series, M = 5 for quarterly series, M = 15 for monthly series.

3. Forecasting Formulas

For forecasting purposes it suffices to adopt for $\tilde{Y}(t)$ a
stationary autoregressive model of suitable order m whose
coefficients $\alpha_1, \ldots, \alpha_m$ are estimated by Yule-Walker equations in
the correlation function $\rho(v)$ of $\tilde{Y}(t)$. In this paper the model
adopted for all time series was of the form

$$
\tilde{Y}(t) = Y(t) - \hat\phi(\hat\tau)\, Y(t-\hat\tau)
$$

$$
\tilde{Y}(t) + \alpha_1 \tilde{Y}(t-1) + \cdots + \alpha_m \tilde{Y}(t-m) = \varepsilon(t)
$$

The residual variances are denoted

$$
\mathrm{RVY} = \sum_{t=M+1}^{T} \tilde{Y}^2(t) \div \sum_{t=M+1}^{T} Y^2(t)
$$

$$
\mathrm{RVYT} = \sum_{t=\hat\tau+1}^{T} \varepsilon^2(t) \div \sum_{t=\hat\tau+1}^{T} \tilde{Y}^2(t)
$$

The last 18 points of the graphs of Y and $\tilde{Y}$ represent not
observed values of these series but forecasted values of horizons h
= 1 to 18. The mathematical procedure by which they are derived is
as follows.


Let

$$
Y^{\mu}(t+h|t) = E\{Y(t+h) \mid Y(t), Y(t-1), \ldots\}
$$

denote the predictor of Y(t+h) given values Y(t), Y(t-1), ... .

From the equation

$$
Y(t+h) = \hat\phi(\hat\tau)\, Y(t-\hat\tau+h) + \tilde{Y}(t+h)
$$

one obtains, by conditioning with respect to Y(t), Y(t-1), ...

$$
Y^{\mu}(t+h|t) = \hat\phi(\hat\tau)\, Y^{\mu}(t-\hat\tau+h|t) + \tilde{Y}^{\mu}(t+h|t).
$$

To obtain a formula for forecasts of $\tilde{Y}$ when we have fitted an
AR(m) to $\tilde{Y}$:

$$
\tilde{Y}(t) + \alpha_1 \tilde{Y}(t-1) + \cdots + \alpha_m \tilde{Y}(t-m) = \varepsilon(t),
$$

write

$$
\tilde{Y}(t+h) + \alpha_1 \tilde{Y}(t+h-1) + \cdots + \alpha_m \tilde{Y}(t+h-m) = \varepsilon(t+h),
$$

$$
\tilde{Y}^{\mu}(t+h|t) + \alpha_1 \tilde{Y}^{\mu}(t+h-1|t) + \cdots + \alpha_m \tilde{Y}^{\mu}(t+h-m|t) = 0.
$$

One can now compute $\tilde{Y}^{\mu}(t+h|t)$ recursively for h = 1, 2, ...,
using the fact that

$$
\tilde{Y}^{\mu}(t+j|t) = \tilde{Y}(t+j) \quad \text{if } j \le 0.
$$

For example,

$$
-\tilde{Y}^{\mu}(t+1|t) = \alpha_1 \tilde{Y}(t) + \cdots + \alpha_m \tilde{Y}(t-m+1).
$$

Then one can compute $Y^{\mu}(t+h|t)$ recursively for h = 1, 2, ... using
the fact that

$$
Y^{\mu}(t+j|t) = Y(t+j) \quad \text{if } j \le 0.
$$

For large values of h, one expects $\tilde{Y}^{\mu}(t+h|t) = 0$. Then

$$
Y^{\mu}(t+h|t) = \hat\phi(\hat\tau)\, Y^{\mu}(t+h-\hat\tau|t).
$$

When $\hat\phi(\hat\tau) > 1$, this does not damp down to zero, and provides the
long term predictability apparent in many of the series.
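
The two-stage recursion above can be written out as a short Python sketch. It assumes the lag, the long memory coefficient, and the AR coefficients have already been estimated; the function name `ararma_forecast`, its argument names, and the toy numbers are illustrative choices, not taken from the report.

```python
def ararma_forecast(y, tau, phi, alpha, horizon):
    """Forecast y for h = 1..horizon from the iterated model

        ytilde(t) = y(t) - phi * y(t - tau)
        ytilde(t) + alpha[0]*ytilde(t-1) + ... + alpha[m-1]*ytilde(t-m) = eps(t)
    """
    m = len(alpha)
    # Memory-shortened series, available for list indices tau, ..., T-1.
    yt_ext = [y[t] - phi * y[t - tau] for t in range(tau, len(y))]
    y_ext = list(y)
    for _ in range(horizon):
        # AR(m) predictor of ytilde: the future innovation is replaced by zero.
        yt_next = -sum(alpha[k] * yt_ext[-1 - k] for k in range(m))
        yt_ext.append(yt_next)
        # Predictor of y: phi times the value tau steps back (observed or
        # already predicted) plus the ytilde predictor.
        y_ext.append(phi * y_ext[-tau] + yt_next)
    return y_ext[len(y):]


# Toy input (made up): a linear series, lag 1 with phi = 1, and an AR(1)
# residual model ytilde(t) - 0.5*ytilde(t-1) = eps(t), i.e. alpha = [-0.5].
forecasts = ararma_forecast([1, 2, 3, 4, 5, 6], tau=1, phi=1.0, alpha=[-0.5], horizon=3)
print(forecasts)
```

In the toy run the residual forecasts halve at each step, so the forecasts of the original series level off, which is the damping behaviour described in the text for the short memory part.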

4. Summary of Iterated Models Fitted to 111 Time Series

Table I describes the lags of the most significant lag
non-stationary scheme for Y(t). For 60% of the monthly series,
the annual period ($\hat\tau$ = 12) was most important; only 26
percent of the quarterly series had an annual period ($\hat\tau$ = 4).

The AR character of the residual series $\tilde{Y}(t)$ is described in
Table II. Order m = 0 indicates white noise (or no memory); 60%
of the yearly series obey the "naive" model $\tilde{Y}(t) = \varepsilon(t)$, white
noise.

Table  III  lists  the  names  of  33  series  arbitrarily  chosen  from 
the  set  of  111  series  to  represent  typical  series.  We  select  this 
small  number  of  series  to  discuss  in  detail.  The  different  types 
of time series which can be diagnosed by our approach to time
series modeling and forecasting are illustrated by the results in
Table IV and the graphs of Y and $\tilde{Y}$ for the series listed in Table
III.

Table I. Lag of Non-stationary AR

Table IV summarizes the basic model diagnostics of a time
series Y(t). These are length; most significant non-stationary
autoregressive lag $\hat\tau$, and coefficient $\hat\phi(\hat\tau)$; the residual variance
RVY of this non-stationary AR scheme; the best orders (denoted CAT
1 and CAT 2) of approximating AR schemes for $\tilde{Y}(t)$, their horizons
HOR 1 and HOR 2, and the residual variance RVYT of the best
approximating AR scheme.

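Some ARMA models for quarterly time series were:

QA  $\tilde{Y} = (I - 1.04L^{4})Y$,  $(I - .74L)\tilde{Y} = (I - .85L^{4})\varepsilon$
QH  $\tilde{Y} = (I - 1.02L)Y$,  $(I - .29L^{4})\tilde{Y} = (I - .38L^{3})\varepsilon$

Some ARMA models for monthly time series were:

MA  $\tilde{Y} = (I - 1.02L^{12})Y$,  $(I - .41L - .32L^{12})\tilde{Y} = (I - .42L + .31L^{5})\varepsilon$
MF  $\tilde{Y} = (I - .97L)Y$,  $(I + .31L^{10})\tilde{Y} = (I - .49\ldots)\varepsilon$
MJ  $\tilde{Y} = (I - 1.08L)Y$,  $(I - .75L - .21L^{3})\tilde{Y} = (I - .54L^{12})\varepsilon$
MN  $\tilde{Y} = (I - 1.04L^{12})Y$,  $(I - .29L^{2} - .28L^{3} - .27L^{11} + .30L^{13})\tilde{Y} = (I - .42L^{12})\varepsilon$
MR  $\tilde{Y} = (I - 1.05L^{12})Y$,  $(I - .21L^{5} - .41L^{6})\tilde{Y} = (I - .55L^{12})\varepsilon$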

Table III. Typical Series for Detailed Discussion
(Y, Q, M are the prefixes of Yearly,
Quarterly and Monthly Series Respectively).

YA  Machinery and Equipment (YAC 17)
YB  National Product and Expenditure-Residential Construction (YAC 26)
YC  Population Movement Male Death (YAD 6)
YD  Crude Birth Rates (YAD 15)
YE  Deaths, Analysis by Age and Sex, All Ages, United Kingdom (YAD 24)

QA  Industrial Production: Textiles (QNI 1)
QB  Industry Germany (QNI 10)
QC  Company Data Germany (QNM 15)
QD  Company Data (QNM 6)
QE  Industrial Production: Durable Manufactures (QRC 13)
QF  Industrial Production: Total Austria (QRC 22)
QG  Value of Manufacturer's New Orders for Consumer Goods (QRC 4)
QH  Per Capita GNP in Current Dollars (QRG 13)
QI  Total Industrial Production (QRG 4)

MA  Company Data (MNB 11)
MB  Company Data (MNB 2)
MC  Company Data (MNB 20)
MD  Company Data (MNB 29)
ME  Company Data USA (MNB 38)
MF  Company Data UK (MNB 97)
MG  Company Data (MNB 56)
MH  Company Data (MNB 65)
MI  Textiles - Quoted at Paris Stock Exchange (MNC 17)
MJ  General Index of the Industrial Production (MNC 26)
MK  Reserves - Danemark (MNC 35)
ML  New Private Housing Units Started Total USA (MNC 44)
MM  Industrial Production Spain (MNG 28)
MN  Industrial Production: Finished Investment Goods Austria (MNG 37)
MO  Aluminium Production Netherlands (MNI 103)
MP  Lead Production Canada (MNI 122)
MQ  Production Tin Thailand (MNI 122)
MR  Industrie France (MNI 13)
MS  Motor Vehicles Production Canada (MNI 131)


Table  IV.  Diagnostics  of  Model  Types  of  Typical  Time  Series 
Appendix

UNIVARIATE  TIME  SERIES  MODELING  AND  FORECASTING 
AUTOMATIC  APPROACHES  USING  ARARMA  MODELS 


The model we propose fitting in general to a time series Y(t)
is an iterated model (with symbolic transfer functions G and $g_m$)

$$
Y(t) \;\longrightarrow\; \boxed{G} \;\longrightarrow\; \tilde{Y}(t) \;\longrightarrow\; \boxed{g_m} \;\longrightarrow\; \varepsilon(t) \ \text{white noise}
$$

where $\tilde{Y}(t)$ is the result of a "memory shortening" transformation
chosen to transform a long memory time series to a short memory
one, and $g_m$ is an innovation filter which is either an approximating
AR  filter  or  an  ARMA  filter.  Parzen  (1982)  introduces  the 
terminology  ARARMA  scheme  for  the  iterated  time  series  model  with  G 
determined  by  a  non-stationary  autoregressive  estimation  procedure; 
an  ARIMA  scheme,  introduced  by  Box  and  Jenkins  (1970),  corresponds 
to  a  pure  differencing  operator  for  G.  Autoregressive  analysis  by 
Yule-Walker  equations  yields  a  stationary  autoregressive  scheme;  a 
non-stationary  autoregressive  scheme  is  one  which  is  fit  by 
estimating  its  coefficients  by  ordinary  least  squares. 

To  identify  the  final  model,  or  "overall  whitening  filter",  of 
a  time  series,  one  should  determine  its  model  memory  type,  and 
identify  an  iterative  model  for  the  time  series: 


IDENTIFY TIME SERIES MEMORY TYPE

No Memory (White Noise) (Unpredictable)

Short Memory (Stationary) (Partially Predictable)
   -> Identify Whitening Filter as AR(p), MA(q), or ARMA(p,q)
   -> Estimate Parameters
   -> No Memory Residuals $\varepsilon$

Long Memory (Non-stationary) (Predictable)
   -> Identify Gentle Transformation to Short Memory Time Series $\tilde{Y}$
   -> Model $\tilde{Y}$ by Whitening Filter
   -> No Memory Residuals $\varepsilon$


A  confirmatory  theory  of  statistical  inference  is  available 
only  for  short  memory  time  series  (which  are  ergodic).  The 
modeling  of  a  short  memory  time  series  by  a  whitening  filter  can  be 
regarded  as  a  science,  and  it  can  be  made  semi-automatic.  Given  a 
sample of short memory stationary time series $\tilde{Y}(t)$, our modeling
procedure  in  the  time  domain  is  to  compute  approximating 
autoregressive  schemes. 
1. Form the sample correlation function

$$
\hat\rho(v) = \sum_{t=1}^{T-v} \tilde{Y}(t)\, \tilde{Y}(t+v) \div \sum_{t=1}^{T} \tilde{Y}^2(t)
$$

but do not base any decision upon it, or upon the partial
correlations. Rather, compute approximating autoregressive
schemes.

2. Solve successive order m = 1, 2, ... Yule-Walker
equations for autoregressive coefficients $\alpha_{1m}, \ldots, \alpha_{mm}$ and residual
variance $\sigma_m^2$.

3. Use an autoregressive order determining criterion (either
CAT or AIC) to determine m(1) and m(2), the best and second best
orders of approximating autoregressive schemes.

4. Compute PVH(h), the prediction variance horizon function
for the insight it provides on the memory type and ARMA type of the
time series. Compute horizons HOR 1, HOR 2 using approximating AR
schemes of orders m(1) and m(2).

5.  Compute  a  subset  AR  model. 

6.  Compute  a  subset  ARMA  model. 

One  can  also  compute  various  spectral  density  functions  and 
spectral  distribution  functions  if  one  would  like  the  additional 
insight  of  the  spectral  domain. 
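
Steps 1 to 3 above can be sketched in Python as follows. The sketch solves the Yule-Walker equations for successive orders with the standard Levinson-Durbin recursion and ranks orders with AIC in its usual form T log(sigma_m^2) + 2m; AIC is used here only because its formula is widely known, whereas the study itself used CAT. All function names and the simulated series are assumptions made for illustration.

```python
import math
import random


def sample_correlations(x, max_lag):
    """rho(v) = sum_{t=1}^{T-v} x(t) x(t+v) / sum_{t=1}^{T} x(t)^2 (no mean removal)."""
    total = sum(v * v for v in x)
    return [sum(x[t] * x[t + v] for t in range(len(x) - v)) / total
            for v in range(max_lag + 1)]


def levinson_durbin(rho, max_order):
    """Solve the Yule-Walker equations for orders 0..max_order.

    Entry m of the result is (alpha, sigma2) for the scheme
    x(t) + alpha[0] x(t-1) + ... + alpha[m-1] x(t-m) = eps(t),
    with sigma2 the residual variance as a fraction of the variance of x.
    """
    fits = [([], 1.0)]
    alpha, sigma2 = [], 1.0
    for m in range(1, max_order + 1):
        # Last coefficient of the order-m scheme (minus the partial correlation).
        k = -(rho[m] + sum(alpha[j] * rho[m - 1 - j] for j in range(m - 1))) / sigma2
        alpha = [alpha[j] + k * alpha[m - 2 - j] for j in range(m - 1)] + [k]
        sigma2 *= 1.0 - k * k
        fits.append((alpha, sigma2))
    return fits


def best_two_orders(x, max_order):
    """Best and second best AR orders by AIC, plus all fitted schemes."""
    fits = levinson_durbin(sample_correlations(x, max_order), max_order)
    aic = [len(x) * math.log(s2) + 2 * m for m, (_, s2) in enumerate(fits)]
    ranked = sorted(range(len(aic)), key=lambda m: aic[m])
    return ranked[0], ranked[1], fits


# Simulated AR(1) series x(t) = 0.8 x(t-1) + eps(t) (made-up test input).
rng = random.Random(0)
x = [0.0]
for _ in range(1999):
    x.append(0.8 * x[-1] + rng.gauss(0.0, 1.0))
m1, m2, fits = best_two_orders(x, max_order=10)
print(m1, m2, fits[1])
```

In the sign convention of the text the fitted order-1 coefficient should come out near -0.8 for this simulated series.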

The  diagnosis  of  a  time  series  as  being  long  memory  can  be 
made  semi-automatic.  Many  criteria  are  available  to  diagnose  time 
series  memory  type,  using  (1)  correlations,  (2)  spectral  densities, 
(3) autoregressive prediction variances, (4) prediction variance
horizon  function,  (5)  spectral  distribution  functions,  and  (6) 
S-PLAY  diagnostics.  The  definitions  below  are  given  in  terms  of 
population  parameters,  assuming  a  stationary  time  series.  In 
practice,  the  diagnosis  is  based  on  sample  analogues  of  these 
parameters. 

The prediction variance horizon PVH(h), h = 1, 2, ..., is
defined in terms of the normalized mean square prediction error of
infinite memory prediction h steps ahead:

$$
\sigma^2_{h,\infty} = E\{|Y^{\nu}(t+h|t)|^2\} \div E\{Y^2(t)\}, \qquad Y^{\nu}(t+h|t) = Y(t+h) - Y^{\mu}(t+h|t),
$$

$$
Y^{\mu}(t+h|t) = E\{Y(t+h) \mid Y(t), Y(t-1), \ldots\}.
$$

A formula for $\sigma^2_{h,\infty}$ is obtained by introducing the MA($\infty$)
representation of
$Y(t) = \varepsilon(t) + \beta_1 \varepsilon(t-1) + \cdots$. Then

$$
\sigma^2_{h,\infty} = \sigma^2_{\infty}\,(1 + \beta_1^2 + \cdots + \beta_{h-1}^2).
$$

The graph of $\sigma^2_{h,\infty}$ increases monotonically from $\sigma^2_{\infty}$ at h = 1 to 1
as h tends to $\infty$. We define

$$
\mathrm{PVH}(h) = 1 - \sigma^2_{h,\infty}, \qquad h = 1, 2, \ldots
$$

and define horizon HOR to be the smallest value of h for which
PVH(h) < 0.05 (whence $\sigma^2_{h,\infty} > .95$).
The infinite moving average coefficients are estimated by
inverting the transfer function $g_m(z)$ of an approximating
autoregressive scheme to obtain, for k = 1, 2, ...

$$
\alpha_0 \beta_k + \alpha_1 \beta_{k-1} + \cdots + \alpha_k \beta_0 = 0.
$$
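
A minimal Python sketch of the PVH and HOR computation follows. It assumes the usual conventions that the leading coefficients alpha_0 and beta_0 equal 1 and that alpha_j = 0 beyond the AR order, and it truncates the infinite moving average at `n_terms` coefficients; these choices, the function names, and the example coefficient are assumptions. The sketch applies the stated definition of HOR literally and does not try to settle how a white noise series should be indexed.

```python
def ma_coefficients(alpha, n):
    """beta_0..beta_n from alpha_0 beta_k + alpha_1 beta_{k-1} + ... + alpha_k beta_0 = 0,
    taking alpha_0 = beta_0 = 1 and alpha_j = 0 for j above the AR order."""
    a = [1.0] + list(alpha)
    beta = [1.0]
    for k in range(1, n + 1):
        beta.append(-sum(a[j] * beta[k - j] for j in range(1, min(k, len(alpha)) + 1)))
    return beta


def pvh_and_horizon(alpha, n_terms=500, threshold=0.05):
    """Return ([PVH(1), ..., PVH(n_terms)], HOR) for an AR scheme with coefficients alpha."""
    beta = ma_coefficients(alpha, n_terms)
    total = sum(b * b for b in beta)          # approximates 1 / sigma_inf^2
    pvh, partial = [], 0.0
    for h in range(1, n_terms + 1):
        partial += beta[h - 1] ** 2           # 1 + beta_1^2 + ... + beta_{h-1}^2
        pvh.append(1.0 - partial / total)     # PVH(h) = 1 - sigma_{h,inf}^2
    hor = next((h for h, p in enumerate(pvh, start=1) if p < threshold), None)
    return pvh, hor


# Example: the AR(1) scheme Y(t) - 0.5 Y(t-1) = eps(t), i.e. alpha = [-0.5].
pvh, hor = pvh_and_horizon([-0.5])
print([round(p, 4) for p in pvh[:4]], hor)
```

For this AR(1) example PVH(h) decays geometrically (0.25 raised to the power h), the smooth exponential decline that the text associates with an AR(p) scheme.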


The classification of memory type by prediction horizon HOR
is:

No Memory      HOR = 0
Short Memory   0 < HOR < $\infty$
Long Memory    HOR = $\infty$

By HOR = $\infty$, we mean HOR is comparatively large: experiments lead
us to conclude that one should compare HOR with the order ORD of
the approximating autoregressive scheme. Let HOR/ORD denote the
ratio of HOR to ORD; identify time series as follows: If HOR/ORD
$\le$ 1, then MA(q), with q $\le$ HOR-1. If HOR/ORD $\ge$ 4 (say) and PVH
decays slowly, then long memory. If PVH declines smoothly and
exponentially, then an AR(p) is indicated. If PVH has "bends",
then ARMA. If PVH has many level stretches with period $\tau$, then an
ARMA model is indicated of the form

$$
Y(t) = \frac{I + \beta_1 L + \beta_2 L^2 + \cdots + \beta_q L^q}{I - \alpha_\tau L^\tau}\; \varepsilon(t)
$$



The final identification of the orders p and q should be by
parameter estimation or by use of S-arrays.

The determination of the most appropriate "gentle" transformation
of Y to $\tilde{Y}$, where Y is long memory and $\tilde{Y}$ is short memory, must
inevitably involve the physical nature of the observed time series.
A semi-automatic approach can be developed by considering the
following examples of long memory time series.

A time series Y(t), t = 0, ±1, ..., is called periodic with
period $\tau$ if

$$
Y(t+\tau) - Y(t) = 0, \quad \text{all } t.
$$

It follows a linear trend Y(t) = a + bt, if for all t

$$
Y(t+1) - Y(t) = b, \quad \text{a constant.}
$$

It is a pure harmonic of period $\tau$ if for all t

$$
Y(t) - \phi\, Y(t-1) + Y(t-2) = 0, \qquad \phi = 2\cos\frac{2\pi}{\tau}.
$$

Then

$$
Y(t) = A\cos\frac{2\pi}{\tau}t + B\sin\frac{2\pi}{\tau}t.
$$
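
The pure harmonic relation can be checked numerically. The short Python sketch below uses an arbitrary period and arbitrary amplitudes (all made up for illustration) and confirms that a sinusoid of period tau satisfies the second-order difference equation with coefficient 2 cos(2 pi / tau) up to rounding error.

```python
import math

tau = 12                      # period (illustrative)
A, B = 3.0, -1.5              # arbitrary amplitudes (illustrative)
phi = 2 * math.cos(2 * math.pi / tau)


def harmonic(t):
    """Pure harmonic of period tau evaluated at time t."""
    return A * math.cos(2 * math.pi * t / tau) + B * math.sin(2 * math.pi * t / tau)


# Y(t) - phi*Y(t-1) + Y(t-2) should vanish for every t.
residuals = [harmonic(t) - phi * harmonic(t - 1) + harmonic(t - 2) for t in range(2, 50)]
max_abs_residual = max(abs(r) for r in residuals)
print(max_abs_residual)
```

The identity behind the check is cos(w t) + cos(w (t-2)) = 2 cos(w) cos(w (t-1)), and likewise for the sine term.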

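As gentle memory shortening transformations, it is natural to
consider

$$
\tilde{Y}(t) = Y(t) - \hat\phi(\hat\tau)\, Y(t-\hat\tau), \tag{1}
$$

$$
\tilde{Y}(t) = Y(t) - \hat\phi_1\, Y(t-1) - \hat\phi_2\, Y(t-2), \tag{2}
$$

$$
\tilde{Y}(t) = Y(t) - \hat\phi_1\, Y(t-(m-1)) - \hat\phi_2\, Y(t-m), \tag{3}
$$

whose coefficients $\hat\tau$, $\hat\phi(\hat\tau)$ are determined adaptively from
the data. Our first choice is (1); the lag $\hat\tau$ is chosen to
minimize over $\tau$

$$
\mathrm{Err}(\tau) = \sum_{t=\tau+1}^{T} \{Y(t) - \hat\phi(\tau)\, Y(t-\tau)\}^2 \div \sum_{t=\tau+1}^{T} Y^2(t)
$$

and $\hat\phi(\tau)$ is chosen to minimize over $\phi(\tau)$

$$
\sum_{t=\tau+1}^{T} \{Y(t) - \phi(\tau)\, Y(t-\tau)\}^2.
$$
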

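The stationary correlation function $\hat\rho(\tau)$ of {Y(t), t = 1, 2,
..., T} is defined by

$$
\hat\rho(\tau) = \sum_{t=1}^{T-\tau} Y(t)\, Y(t+\tau) \div \sum_{t=1}^{T} Y^2(t).
$$

Define

$$
\mathrm{SSQ}(v) = \sum_{t=1}^{v} Y^2(t).
$$

One can show that

$$
\hat\phi(\tau) = \hat\rho(\tau)\, \frac{\mathrm{SSQ}(T)}{\mathrm{SSQ}(T-\tau)}, \qquad
\mathrm{Err}(\tau) = 1 - |\hat\rho(\tau)|^2\, \frac{\mathrm{SSQ}^2(T)}{\mathrm{SSQ}(T-\tau)\,[\mathrm{SSQ}(T) - \mathrm{SSQ}(\tau)]}.
$$

The most significant lag $\hat\tau$ is defined as the value minimizing
Err($\tau$).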
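
The closed forms in terms of the correlation function and SSQ can be checked against the direct least-squares computation. The Python sketch below does this on a short made-up series; the function names `direct` and `via_ssq` are illustrative, and SSQ(v) is taken to be the sum of squares of the first v observations.

```python
def direct(y, tau):
    """phi(tau) and Err(tau) straight from the least-squares definitions."""
    idx = range(tau, len(y))
    phi = sum(y[t] * y[t - tau] for t in idx) / sum(y[t - tau] ** 2 for t in idx)
    err = sum((y[t] - phi * y[t - tau]) ** 2 for t in idx) / sum(y[t] ** 2 for t in idx)
    return phi, err


def via_ssq(y, tau):
    """The same two quantities from rho(tau) and the partial sums of squares SSQ."""
    T = len(y)

    def ssq(v):
        # SSQ(v) = Y(1)^2 + ... + Y(v)^2
        return sum(val * val for val in y[:v])

    rho = sum(y[t] * y[t + tau] for t in range(T - tau)) / ssq(T)
    phi = rho * ssq(T) / ssq(T - tau)
    err = 1.0 - rho ** 2 * ssq(T) ** 2 / (ssq(T - tau) * (ssq(T) - ssq(tau)))
    return phi, err


series = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]   # made-up values
for lag in (1, 2, 3, 4):
    print(lag, direct(series, lag), via_ssq(series, lag))
```

The two routes should agree to rounding error for every lag, which makes the correlation-based form convenient when the correlations are already available.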

We propose three possible actions at the initial stage of
analysis of a time series {Y(t), t = 1, ..., T}:

L. Declare time series to be long memory,
and form $\tilde{Y}(t)$ by (1).

M. Declare time series to be moderately long
memory, and form $\tilde{Y}(t)$ by (2).

S. Declare time series to be short memory
and form $\tilde{Y}(t) = Y(t)$, or $\tilde{Y}(t) = Y(t) - \bar{Y}$
where $\bar{Y}$ is the sample mean. After computing $\bar{Y}$, one performs a
naive test to decide if it should be set equal to 0; a naive test
is $|\bar{Y}| \le 2\sigma/\sqrt{T}$, where $\sigma$ is the sample standard deviation.

1. Compute and print $\hat\phi(\tau)$ and Err($\tau$) for $\tau$ = 1, 2, ..., M,
where M is suitably chosen (15 for yearly, quarterly, or monthly
data).

2. Determine $\hat\tau$. If Err($\hat\tau$) $\le$ 8/T, go to L.

3. If $\hat\phi(\hat\tau) \ge .9$, and $\hat\tau$ > 2, go to L.

4. If $\hat\phi(\hat\tau) \ge .9$ and $\hat\tau$ = 1 or 2, determine the best fitting
non-stationary AR(2) scheme minimizing

$$
\sum_{t=3}^{T} \{Y(t) - \phi_1 Y(t-1) - \phi_2 Y(t-2)\}^2.
$$

Let $\hat\phi_1$, $\hat\phi_2$ denote the minimizing values of $\phi_1$ and $\phi_2$. Then go to
M.

5. If $\hat\phi(\hat\tau)$ < .9 go to S.

6. If $\hat\phi(\tau)$ is approximately 1 for some $\tau$, one may set this
value of $\tau$ equal to $\hat\tau$ and go to L. One compares the stationary
analysis of this choice of memory shortening transformation with
that determined by the value of $\tau$ minimizing Err($\tau$).

7. Non-stationary prediction analysis of a time series in
general finds coefficients $\phi_1, \ldots, \phi_m$ minimizing (for a specified
memory m)

$$
\sum_{t=m+1}^{T} \{Y(t) - \phi_1 Y(t-1) - \cdots - \phi_m Y(t-m)\}^2.
$$

We recommend a subset regression solution which attempts to
determine the most significant lags $j_1, \ldots, j_n$ minimizing

$$
\sum_{t=m+1}^{T} \{Y(t) - \phi_{j_1} Y(t-j_1) - \cdots - \phi_{j_n} Y(t-j_n)\}^2
$$

and determines the solution for a specified set of lags $j_1, \ldots, j_n$.
One may take n = 2, and $j_1$ and $j_2$ are two adjacent lags (m-1
and m) for which $\hat\phi(\tau)$ is approximately 1; one then obtains the
transformation of type (3).
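
Steps 1 to 5 of the rules above amount to a small decision procedure, sketched here in Python. The sketch only returns the label L, M, or S together with the selected lag and coefficient; the AR(2) fit of step 4 and the judgment calls of steps 6 and 7 are left out. The function name `classify_memory` and both test series are assumptions made for illustration.

```python
import random


def classify_memory(y, max_lag=15):
    """Return (action, tau_hat, phi_hat) with action 'L', 'M', or 'S'."""
    T = len(y)
    fits = []
    for tau in range(1, min(max_lag, T - 1) + 1):        # step 1
        idx = range(tau, T)
        phi = sum(y[t] * y[t - tau] for t in idx) / sum(y[t - tau] ** 2 for t in idx)
        err = sum((y[t] - phi * y[t - tau]) ** 2 for t in idx) / sum(y[t] ** 2 for t in idx)
        fits.append((tau, phi, err))
    tau_hat, phi_hat, err_hat = min(fits, key=lambda fit: fit[2])
    if err_hat <= 8.0 / T:                               # step 2
        return "L", tau_hat, phi_hat
    if phi_hat >= 0.9:                                   # steps 3 and 4
        return ("L" if tau_hat > 2 else "M"), tau_hat, phi_hat
    return "S", tau_hat, phi_hat                         # step 5


# A growing seasonal series (made up) and a simulated white noise series.
pattern = [10.0, 14.0, 9.0, 12.0]
seasonal = [pattern[t % 4] * 1.05 ** (t // 4) for t in range(40)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(100)]

print(classify_memory(seasonal))
print(classify_memory(noise))
```

The seasonal series should be sent to action L with a lag that is a multiple of 4, and the white noise series to action S.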

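A model frequently fitted to monthly economic time series is
the so-called "airline" model (see Parzen (1979)):

$$
(I - L)(I - L^{12})\, Y(t) = (I - \theta_1 L)(I - \theta_{12} L^{12})\, \varepsilon(t).
$$

It seems doubtful that this model would be judged adequate by our
criteria, which propose

$$
\tilde{Y}(t) = (I - \hat\phi(12) L^{12})\, Y(t), \qquad g_{13}(L)\, \tilde{Y}(t) = \varepsilon(t).
$$

If one desires a parsimonious ARMA model for $\tilde{Y}(t)$ it may be given
by

$$
\tilde{Y}(t) + \alpha_1 \tilde{Y}(t-1) + \alpha_{12} \tilde{Y}(t-12) + \alpha_{13} \tilde{Y}(t-13) = \varepsilon(t)
$$

or

$$
\tilde{Y}(t) + \alpha_1 \tilde{Y}(t-1) + \alpha_2 \tilde{Y}(t-2) = \varepsilon(t) + \beta_{12}\, \varepsilon(t-12).
$$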

It should be noted that double differencing is not recommended
by us as a memory shortening transformation. When the need for
double differencing arises, it appears as a situation in which long
memory components continue to be present even after several
iterations; then the final iterated model is of the form

$$
Y(t) \;\longrightarrow\; \tilde{Y}^{(1)}(t) \;\longrightarrow\; \tilde{Y}^{(2)}(t) \;\longrightarrow\; \varepsilon(t)
$$

An iterated filter model provides not only forecasts and spectral
analysis, but also model interpretation.
Graphs of Y and $\tilde{Y}$ (denoted YT) for the 33 time series listed in
Table III. The break in the graphs indicates the end of the
observed values of the time series and the beginning of predictions
of the next 18 values.

[Graphs not reproduced. Surviving panel titles: National Product & Expenditure-Residential Construction (YAC 26); Machinery & Equipment (YAC 17); Deaths, Analysis by Age & Sex, All Ages, UK (YAD 24); Crude Birth Rates (YAD 15); Industrial Production: Durable Manufactures (QRC 13); Total Industrial Production (QRG 4); Per Capita GNP in Current Dollars (QRG 13); New Private Housing Units Started Total USA (MNC 44).]
"HOW TO LEARN FROM THE JOF COMPETITION"
by 

Emanuel  Parzen  and  H.  J.  Newton 
Institute  of  Statistics 
Texas  A&M  University 


The  significance  of  the  "forecasting  competition"  is  best  illustrated  by 
comparing  it  to  horse  racing.  One  may  distinguish  two  main  types  of  people 
at  the  race  track.  Type  A  are  bettors;  they  go  to  the  track  to  bet  on  the 
outcomes  of  the  races  and  are  concerned  only  with  predicting  winners.  Type  B 
are  lovers  of  knowledge;  they  go  to  enjoy  the  beauty  of  the  horses  (and 
perhaps  believe  that  the  purpose  of  horse-racing  is  improvement  of  the  breed!), 
and  are  satisfied  with  watching  the  race. 

From a forecasting competition, Type A people want to know who won,
which was not explicitly reported in Makridakis et al. (1982). The JoF
Competition merits publication as a report of raw summaries of the results.
Realistically, the authors are not likely to take any action which implies
that half of their members are below average. It is appropriate, and desirable,
to  have  subsequent  papers  that  analyze  and  interpret  the  results  of  the 
forecasting  competition.  We  thank  the  authors  who  have  provided  commentaries 
in  this  issue  for  the  enlightenment  that  they  have  provided. 

Our  approach  to  the  forecasting  process  is  based  on  the  belief  that  a 
forecasting  procedure  should,  in  addition  to  forecasts,  provide  knowledge 
about  the  "information"  in  the  time  series.  Important  aspects  of  information 
are  modern  versions  of  the  classic  idea  that  a  time  series  can  be  usefully 
decomposed  into  trend,  seasonal,  and  covariance-stationary  irregular. 

Parzen  (1981)  states  that  the  first  step  in  analysis  of  a  time  series  is  to 
determine  its  "memory".  "Short  memory"  corresponds  to  a  covariance-stationary 
time  series  for  which  there  are  available  semi-automatic  model  identification 
criteria  for  fitting  AR,  MA,  and  ARMA  schemes  which  transform  the  "short 
memory" time series to a "no memory" time series (white noise). "Long memory"
contains trend and seasonal components which one seeks to model by regression
(on other series or on deterministic functions) or non-stationary autoregression
on its past (the first AR in ARARMA).

It is our experience that the transformation of a long memory time series
to its "no memory form" has the following "uniqueness" property: if $\varepsilon_1(t)$ and
$\varepsilon_2(t)$ are the white noise residual time series of two different methods of
decomposition, then $\varepsilon_1(\cdot)$ and $\varepsilon_2(\cdot)$ are approximately identically distributed.
One usually can conceive of several ways of transforming a long memory time
series to a short memory time series; the optimal transformation is not a
statistical matter, but depends on how the final overall model is to be
applied and interpreted.

Automatic  AR  and  ARMA  model  identification  algorithms  can  be  used  to 
generate  analytically  several  models  (called  "best"  and  "second  best"),  and, 
thus,  forecasts,  based  on  the  information  contained  in  past  data. 

Forecasters  should  devise  systems  for  comparisons  of  forecasts  generated 
by  different  procedures  on  the  time  series  of  interest  to  their  organization, 
rather  than  relying  on  comparisons  of  other  time  series.  The  publication  of 
such  case  studies  should  be  encouraged. 

Our approach to time series analysis is used in the TIMESBOARD library
of time series analysis mainline programs and computer subroutines (Newton
(1982)). TIMESBOARD provides tools for a decision-maker seeking forecasting
models  developed  by  identifying  the  information  and  memory  in  the  time  series. 
Our  program  DTFORE  produces  several  sets  of  forecasts  for  each  time  series. 
Each  set  is  optimal  in  a  statistical  sense,  depending  on  how  the  forecaster 
desires  to  interpret  the  diagnostics  concerning  information  and  memory  of  the 
series.  For  example,  faced  with  the  problem  of  forecasting  a  series  that  is 
undergoing  explosive  growth,  one  can  obtain  a  set  of  forecasts  for  continued 
growth,  for  leveling  off,  and  for  decline.  The  forecaster,  together  with 
the  decision  maker,  can  decide  which  method  to  use.  Of  course,  the  rules  of 
the  competition  demanded  that  we  produce  a  single  set  of  forecasts  for  each 
series.  This  was  done  automatically. 

The  question  remains,  then,  how  to  improve  the  results  of  the  JoF 
Competition.  We  have  two  suggestions. 

First,  produce  plots  of  the  various  forecasts  appended  one  above  the 
other,  together  with  the  true  future  values.  Obviously,  publishing  such  a 
graph  for  1001  series  is  impractical.  However,  a  representative  sample  of 
each  type  could  be  published. 

Secondly, forecasting methods are, in our opinion, best compared by
forming the time series of forecast errors and studying them. One approach
to studying distributions of errors is the quantile and functional
statistical inference methods being developed by Parzen (1979) that compute
medians, interquartile ranges, and various measures of distributional shape.
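
As a very small illustration of the second suggestion, the Python sketch below summarizes a set of forecast errors by its median and interquartile range using the standard `statistics` module. It is not the quantile and functional inference methodology cited above, only the two simplest summaries it mentions; the function name and the error values are made up for the example.

```python
import statistics


def summarize_errors(errors):
    """Median and interquartile range of a collection of forecast errors."""
    q1, _, q3 = statistics.quantiles(errors, n=4)   # the three quartile cut points
    return {"median": statistics.median(errors), "iqr": q3 - q1}


# Hypothetical forecast errors (actual minus forecast) for one method.
example_errors = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 10.0]
summary = summarize_errors(example_errors)
print(summary)
```

Unlike a mean-based measure, these two summaries are barely moved by the single large error in the example, which is one reason to look at the whole error distribution.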